The random assignment problem asks for the minimum-cost perfect matching in the complete $n \times n$ bipartite graph $\mathcal{K}_{n,n}$ with i.i.d. edge weights, say uniform on [0, 1]. In a remarkable work by Aldous (2001), the optimal cost was shown to converge to $\zeta(2)$ as $n \to \infty$, as conjectured by Mézard and Parisi (1987) through the so-called cavity method. The latter also suggested a non-rigorous decentralized strategy for finding the optimum, which turned out to be an instance of the Belief Propagation (BP) heuristic discussed by Pearl (1987). In this paper we use the objective method to analyze the performance of BP as the size of the underlying graph becomes large. Specifically, we establish that the dynamic of BP on $\mathcal{K}_{n,n}$ converges in distribution as $n \to \infty$ to an appropriately defined dynamic on the Poisson Weighted Infinite Tree, and we then prove correlation decay for this limiting dynamic. As a consequence, we obtain that BP finds an asymptotically correct assignment in $O(n^2)$ time only. This contrasts with both the worst-case upper bound for convergence of BP derived by Bayati, Shah and Sharma (2005) and the best-known computational cost of $\Theta(n^3)$ achieved by Edmonds and Karp's algorithm (1972).
1. Introduction. Given a matrix of costs $(X_{i,j})_{1 \le i,j \le n}$, the assignment problem consists of determining a permutation $\pi$ of $\{1, \ldots, n\}$ whose total cost $\sum_{i=1}^{n} X_{i,\pi(i)}$ is minimal. This is equivalent to finding a minimum-weight complete matching in the $n \times n$ complete bipartite graph whose edges are weighted by the $(X_{i,j})$. Recall that a complete matching on a graph is a subset of pairwise disjoint edges covering all vertices. Here we consider the so-called random assignment problem where the $(X_{i,j})$ are i.i.d. with cumulative distribution function denoted by $H$, i.e. $H(t) = P(X_{i,j} \le t)$. We let $\mathcal{K}_{n,n}$ denote the resulting randomly weighted $n \times n$ bipartite graph and $\pi^*_{\mathcal{K}_{n,n}}$ its optimal matching. Observe that the continuity of $H$ is a necessary and sufficient condition for $\pi^*_{\mathcal{K}_{n,n}}$ to be a.s. unique. We are interested in the convergence of the BP heuristic for finding $\pi^*_{\mathcal{K}_{n,n}}$ as $n$ increases to infinity.
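To make the objective concrete, here is a minimal sketch of the assignment problem solved by exhaustive search. The function names and the 3 x 3 cost matrix are illustrative assumptions, not taken from the paper, and the search enumerates all $n!$ permutations, so it is only meant for tiny instances.

```python
from itertools import permutations

def assignment_cost(cost, perm):
    """Total cost sum_i X[i][perm[i]] of assigning row i to column perm[i]."""
    return sum(cost[i][perm[i]] for i in range(len(cost)))

def brute_force_assignment(cost):
    """Exhaustive search over all n! permutations (hypothetical helper, tiny n only)."""
    n = len(cost)
    return min(permutations(range(n)), key=lambda p: assignment_cost(cost, p))

# Illustrative cost matrix (an assumption, not data from the paper).
cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
best = brute_force_assignment(cost)
```

Such a brute-force solver is useful only as a ground truth against which faster heuristics can be compared on small inputs.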

1.1 Related Work. Although deceptively simple, the assignment problem has led to rich developments in combinatorial probability and algorithm design since the early 1960s. Partly motivated by the search for insights into better algorithm design, the question of finding asymptotics of the average cost of $\pi^*_{\mathcal{K}_{n,n}}$ became of great interest. In 1987, through cavity-method-based calculations, Mézard and Parisi conjectured that, for Exponential(1) edge weights,

$$E\left[\sum_{i=1}^{n} X_{i,\pi^*_{\mathcal{K}_{n,n}}(i)}\right] \xrightarrow[n \to \infty]{} \zeta(2).$$

This was rigorously established by Aldous [2] more than a decade later, leading to the formalism of "the objective method" (see the survey by Aldous and Steele [4]). In 2003, an exact version of the above conjecture was independently established by Nair, Prabhakar and Sharma [17] and Linusson and Wästlund [15].

On the algorithmic side, the assignment problem has been extremely well studied, and its consideration laid foundations for the rich theory of network flow algorithms. The best known algorithm is by Edmonds and Karp [10] and takes $O(n^3)$ operations in the worst case for an arbitrary instance. For i.i.d. random edge weights, Karp [12] designed a special implementation of the augmenting path approach using priority queues that works in expected time $O(n^2 \log n)$. Concurrently, the statistical physics-based approach mentioned above suggested a non-rigorous decentralized strategy which turned out to be an instance of the more general BP heuristic, popular in artificial intelligence (see the book by Pearl [19] and the work by Yedidia, Freeman and Weiss). In a recent work, one of the authors of the present paper, Shah, along with Bayati and Sharma [6], established correctness of this iterative scheme for any instance of the assignment problem, as long as the optimal solution is unique. More precisely, they showed exact convergence within at most $\lceil 2n \max_{i,j} X_{i,j} / \varepsilon \rceil$ iterations, where $\varepsilon$ denotes the difference of weights between the optimum and the second optimum. This upper bound is always greater than $n$, and can be shown to scale like $\Theta(n^2)$ as $n$ goes to infinity in the random model. Since each iteration of the BP algorithm requires $\Theta(n^2)$ operations, one is left with an upper bound of $O(n^4)$ for the total computation cost. However, simulation studies tend to show much better performance on average than what is suggested by this worst-case analysis.

1.2 Our contribution. Motivated by the above discussion, we consider here the question of determining the convergence rate of BP for the random assignment problem. We establish that, for a large class of edge-weight distributions, the number of iterations required in order to find an almost optimal assignment remains in fact bounded as $n \to \infty$. Thus, the total computation cost scales as $O(n^2)$ only, in sharp contrast with both the worst-case upper bound for exact convergence of BP derived in [5] and the $\Theta(n^3)$ bound achieved by Edmonds and Karp's algorithm. Clearly, no algorithm can perform better than $\Omega(n^2)$, since it is the size of the input. That is, BP is an asymptotically optimal algorithm on average.

2. Result and organization. 

2.1 BP algorithm. As we shall see later, the dynamics of BP on $\mathcal{K}_{n,n}$ happens to converge to the dynamics of BP on a limiting infinite tree. Therefore, we define the BP algorithm for an arbitrary weighted graph $G = (V, E)$. We denote the weight of $\{v,w\} \in E$ by $\|v,w\|_G$, and we write $w \sim v$ to indicate that $w$ is a neighbor of $v$ in $G$. Note that a complete matching on $G$ can be equivalently seen as an involutive mapping $\pi_G$ connecting each vertex $v$ to one of its neighbors $\pi_G(v)$. We shall henceforth use this mapping representation rather than the edge set description.

The BP algorithm is distributed and iterative. Specifically, in each iteration $k \ge 0$, every vertex $v \in V$ sends a real-valued message $\langle v \to w \rangle^k_G$ to each of its neighbors $w \sim v$ as follows:

• initialization rule:

$$\langle v \to w \rangle^0_G = 0; \qquad (1)$$

• update rule:

$$\langle v \to w \rangle^{k+1}_G = \min_{u \sim v,\, u \ne w} \left\{ \|v,u\|_G - \langle u \to v \rangle^k_G \right\}. \qquad (2)$$

Based on those messages, every vertex $v \in V$ estimates the neighbor $\pi^k_G(v)$ to which it connects as follows:

• decision rule:

$$\pi^k_G(v) = \arg\min_{u \sim v} \left\{ \|v,u\|_G - \langle u \to v \rangle^k_G \right\}. \qquad (3)$$

When $G = \mathcal{K}_{n,n}$, [6] ensures convergence of $\pi^k_G$ to the optimum $\pi^*_G$ as long as the latter is unique, which holds almost surely if and only if $H$ is continuous. The present paper asks about the typical rate of such a convergence, and more precisely its dependency upon $n$ as $n$ increases to $\infty$.
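The rules (1)-(3) can be sketched directly for the complete bipartite graph given by a cost matrix. The following is an illustrative, unoptimized sketch: the function name `bp_assignment` and the row/column message layout are assumptions made here, and this naive version spends $O(n^3)$ operations per iteration, whereas tracking the two smallest values at each vertex would bring an iteration down to the $O(n^2)$ cost used in the paper's accounting.

```python
def bp_assignment(cost, iterations):
    """Run `iterations` rounds of the update rule (2) on an n x n cost matrix (n >= 2)
    and return the decisions of rule (3) for rows and for columns."""
    n = len(cost)
    # row_msg[i][j]: message from row i to column j; col_msg[j][i]: column j to row i.
    row_msg = [[0.0] * n for _ in range(n)]  # initialization rule (1)
    col_msg = [[0.0] * n for _ in range(n)]
    for _ in range(iterations):
        # Update rule (2): minimize over all neighbors except the recipient.
        new_row = [[min(cost[i][u] - col_msg[u][i] for u in range(n) if u != j)
                    for j in range(n)] for i in range(n)]
        new_col = [[min(cost[u][j] - row_msg[u][j] for u in range(n) if u != i)
                    for i in range(n)] for j in range(n)]
        row_msg, col_msg = new_row, new_col
    # Decision rule (3): minimize over all neighbors.
    row_choice = [min(range(n), key=lambda j: cost[i][j] - col_msg[j][i])
                  for i in range(n)]
    col_choice = [min(range(n), key=lambda i: cost[i][j] - row_msg[i][j])
                  for j in range(n)]
    return row_choice, col_choice
```

With zero iterations every vertex simply picks its cheapest neighbor; on a matrix with zeros on the diagonal and ones elsewhere, the decisions stay on the diagonal at every step.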
2.2 Result. In order to state our main result, we introduce the normalized Hamming distance between two given assignments $\pi, \pi'$ on a graph $G = (V, E)$:

$$d(\pi, \pi') = \frac{1}{\mathrm{card}\, V}\, \mathrm{card}\{v \in V,\ \pi(v) \ne \pi'(v)\}.$$



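Theorem 2.1 Assume the cumulative distribution function $H$ satisfies:
A1. Regularity: $H$ is continuous and $H'(0^+)$ exists and is non-zero;
A2. Light-tail property: as $t \to \infty$, $H(t) = 1 - O(e^{-\beta t})$ for some $\beta > 0$.

Then,

$$\limsup_{n \to \infty} E\left[ d\left(\pi^k_{\mathcal{K}_{n,n}}, \pi^*_{\mathcal{K}_{n,n}}\right) \right] \xrightarrow[k \to \infty]{} 0.$$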



In other words, given any $\varepsilon > 0$, there exist $k(\varepsilon), n(\varepsilon)$ such that the expected fraction of non-optimal row-to-column assignments after $k(\varepsilon)$ iterations of the BP algorithm on a random $n \times n$ cost array is less than $\varepsilon$, no matter how large $n \ge n(\varepsilon)$ is. Consequently, the probability of getting more than any given fraction of errors can be made as small as desired within finitely many iterations, independently of $n$. Since each iteration requires $O(n^2)$ operations, the overall computation cost scales as $O(n^2)$ only, with a constant depending on the admissible error. This applies to a wide class of cost distributions, including uniform over [0, 1] or Exponential.
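The error measure in Theorem 2.1 is simple to compute. As a small sketch, assuming each assignment is stored as a list indexed by vertex (a representation chosen here for illustration, not prescribed by the paper):

```python
def normalized_hamming(pi, pi_prime):
    """Fraction of vertices v at which the two assignments disagree."""
    if len(pi) != len(pi_prime):
        raise ValueError("assignments must be defined on the same vertex set")
    return sum(1 for a, b in zip(pi, pi_prime) if a != b) / len(pi)
```

A value below $\varepsilon$ means that fewer than an $\varepsilon$ fraction of the vertices are matched differently by the two assignments.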

Remark 2.1 It may be the case that the $\varepsilon$ fraction of wrong row-to-column assignments results in local violations of the matching property. Depending on the context of application, this might be quite unsatisfactory. However, such an "$\varepsilon$-feasible matching" can easily be modified in order to produce an honest matching without substantially increasing the total cost (see [1, Proposition 2] for details).

2.3 Organization. The remainder of the paper is dedicated to proving Theorem 2.1. Although it is far from being an implication of the result by Aldous [2], it utilizes the machinery of local convergence, and in particular the Poisson Weighted Infinite Tree $\mathcal{T}$ appearing as the limit of $(\mathcal{K}_{n,n})_{n \ge 1}$. These notions are recalled in Section 3. The diagram below illustrates the three steps of our proof: Theorem 2.1 corresponds to establishing the top-horizontal arrow, which is done by establishing the three others.



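$$
\begin{array}{ccc}
\pi^k_{\mathcal{K}_{n,n}} & \xrightarrow[k \to \infty]{} & \pi^*_{\mathcal{K}_{n,n}} \\
\Big\downarrow \text{Step 1 (Section 4)},\ n \to \infty & & \Big\downarrow \text{Step 3 (Section 6)},\ n \to \infty \\
\pi^k_{\mathcal{T}} & \xrightarrow[k \to \infty]{\text{Step 2 (Section 5)}} & \pi^*_{\mathcal{T}}
\end{array}
$$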



1. First (Section 4), we prove that BP's behavior on $\mathcal{K}_{n,n}$ "converges" as $n \to \infty$ to its behavior on $\mathcal{T}$. This is formally stated as Theorem 4.1 and corresponds to the left vertical arrow above.

2. Second (Section 5), we establish convergence of BP on $\mathcal{T}$. This is summarized as Theorem 5.2 and corresponds to the bottom horizontal arrow in the above diagram. We note that Theorem 5.1 resolves an open problem stated by Aldous and Bandyopadhyay ([3, Open Problem #62]).

3. Third (Section 6), the connection between the fixed point on $\mathcal{T}$ and the optimal matching on $\mathcal{K}_{n,n}$ is provided by the work of Aldous [2], corresponding to the right vertical arrow and stated as Theorem 6.1. We use it to complete our proof.
3. Preliminaries. We recall here the necessary framework introduced by Aldous in [2]. Consider a rooted, edge-weighted and connected graph $G$, with the distance between two vertices being defined as the infimum over all paths connecting them of the sum of edge weights along that path. For any $\varrho > 0$, define the $\varrho$-restriction of $G$ as the subgraph $[G]_\varrho$ induced by the vertices lying within distance $\varrho$ from the root. Call $G$ a geometric graph if $[G]_\varrho$ is finite for every $\varrho > 0$.
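As an illustration of the $\varrho$-restriction, the vertex set of $[G]_\varrho$ can be computed by a shortest-path search that never leaves the ball of radius $\varrho$. The sketch below assumes a finite graph stored as a dictionary of neighbor-to-weight dictionaries; the function name and this representation are choices made here for illustration.

```python
import heapq

def restriction(adj, root, rho):
    """Return {vertex: distance} for all vertices within weighted distance rho
    of the root, using Dijkstra's algorithm truncated at radius rho."""
    dist = {root: 0.0}
    heap = [(0.0, root)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for w, weight in adj[v].items():
            nd = d + weight
            if nd <= rho and nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist
```

The subgraph $[G]_\varrho$ is then the one induced by the returned vertices.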

Definition 3.1 (local convergence) Let $G, G_1, G_2, \ldots$ be geometric graphs. We say that $(G_n)_{n \ge 1}$ converges to $G$ if for every $\varrho > 0$ such that no vertex in $G$ is at distance exactly $\varrho$ from the root the following holds:

1. $\exists n_\varrho \in \mathbb{N}$ s.t. the $[G_n]_\varrho$, $n \ge n_\varrho$ are all isomorphic¹ to $[G]_\varrho$;

2. The corresponding isomorphisms $\gamma_n \colon [G]_\varrho \to [G_n]_\varrho$, $n \ge n_\varrho$ can be chosen so that for every edge $\{v,w\}$ in $[G]_\varrho$:

$$\|\gamma_n(v), \gamma_n(w)\|_{G_n} \xrightarrow[n \to \infty]{} \|v,w\|_G.$$

In the case of labeled geometric graphs, each oriented edge $(v,w)$ is also assigned a label $\lambda(v,w)$ taking values in some Polish space $\Lambda$. Then the isomorphisms $(\gamma_n)_{n \ge n_\varrho}$ have to moreover satisfy the following:

3. For every oriented edge $(v,w)$ in $[G]_\varrho$, $\lambda_{G_n}(\gamma_n(v), \gamma_n(w)) \xrightarrow[n \to \infty]{} \lambda_G(v,w)$.

The intuition behind this definition is the following: in any arbitrarily large but fixed neighborhood of the root, $G_n$ should look very much like $G$ for large $n$, in terms of structure (part 1), edge weights (part 2) and labels (part 3). With little work, one can define a distance that metrizes this notion of convergence and makes the space of (labeled) geometric graphs complete and separable. As a consequence, one can import the usual machinery related to the theory of weak convergence of probability measures. We refer the reader unfamiliar with these notions to the excellent book of Billingsley [7].

Now, consider our randomly weighted $n \times n$ bipartite graph $\mathcal{K}_{n,n}$ as a random geometric graph by fixing an arbitrary root, independently of the edge weights. Then the sequence $(\mathcal{K}_{n,n})_{n \ge 1}$ happens to converge locally in distribution to an appropriately weighted infinite random tree. Before we formally state this result, known as the "PWIT Limit Theorem" [2, 3], we introduce some notation that will be useful throughout the paper. We let $\mathcal{V}$ denote the set of all finite words over the alphabet $\mathbb{N}^*$, $\emptyset$ the empty word, "." the concatenation operation and, for any $v \in \mathcal{V}^* = \mathcal{V} \setminus \{\emptyset\}$, $\dot{v}$ the word obtained from $v$ by deleting the last letter. We also set $\mathcal{E} = \{\{v, v.i\},\ v \in \mathcal{V},\ i \ge 1\}$. The graph $T = (\mathcal{V}, \mathcal{E})$ thus denotes an infinite tree with $\emptyset$ as root, letters as the nodes at depth 1, words of length 2 as the nodes at depth 2, etc. Now, consider a collection $(\xi^v = \xi^v_1, \xi^v_2, \ldots)_{v \in \mathcal{V}}$ of independent, ordered Poisson point processes with intensity 1 on $\mathbb{R}^+$, and assign to edge $\{v, v.i\} \in \mathcal{E}$ the weight $\|v, v.i\|_{\mathcal{T}} = \xi^v_i$. This defines the law of a random geometric graph $\mathcal{T}$ called the "Poisson Weighted Infinite Tree" (PWIT).
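The construction above lends itself to simulation on a finite window. The following sketch samples the PWIT edge weights down to a fixed depth, keeping only edges of weight at most a cutoff; both truncations, the function names, and the encoding of words as tuples of integers are assumptions made here for illustration (in particular, this window is not exactly a $\varrho$-restriction, which is defined through path distance from the root).

```python
import random

def poisson_points(rng, cutoff):
    """Ordered points of a rate-1 Poisson process on [0, cutoff],
    generated from i.i.d. Exponential(1) gaps."""
    points, t = [], rng.expovariate(1.0)
    while t <= cutoff:
        points.append(t)
        t += rng.expovariate(1.0)
    return points

def sample_pwit(rng, depth, cutoff):
    """Return {(v, v.i): weight} for words v of length < depth,
    where the i-th child edge of v carries the i-th Poisson point."""
    weights = {}
    frontier = [()]  # the empty word is the root
    for _ in range(depth):
        next_frontier = []
        for v in frontier:
            for i, xi in enumerate(poisson_points(rng, cutoff), start=1):
                child = v + (i,)
                weights[(v, child)] = xi
                next_frontier.append(child)
        frontier = next_frontier
    return weights
```

Each vertex draws its own independent point process, so the children of a vertex are automatically ordered by increasing edge weight.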

Theorem 3.1 (PWIT Limit Theorem, Aldous [1, 2]) Under assumption A1 on $H$:

$$nH'(0^+)\,\mathcal{K}_{n,n} \xrightarrow[n \to \infty]{\mathcal{D}} \mathcal{T}, \qquad (4)$$

in the sense of local weak convergence of geometric graphs.

Remark 3.1 To get rid of scaling factors, we will henceforth multiply all edge weights in $\mathcal{K}_{n,n}$ by $nH'(0^+)$. Observe that both the optimal matching $\pi^*_{\mathcal{K}_{n,n}}$ and the BP estimates $\pi^k_{\mathcal{K}_{n,n}}$, $k \ge 0$ remain unaffected.

¹ An isomorphism from $G = (V, \emptyset, E)$ to $G' = (V', \emptyset', E')$, denoted $\gamma \colon G \to G'$, is simply a bijection from $V$ to $V'$ preserving the root ($\gamma(\emptyset) = \emptyset'$) and the structure ($\forall (x,y) \in V^2$, $\{\gamma(x), \gamma(y)\} \in E' \Leftrightarrow \{x,y\} \in E$).
4. First step: convergence to a limiting dynamic as $n \to \infty$. In this section we deduce from the PWIT Limit Theorem that the behavior of BP when running on $\mathcal{K}_{n,n}$ "converges" as $n \to \infty$ to its behavior when running on $\mathcal{T}$. To turn this idea into a rigorous statement, let us encode the execution of BP as labels attached to the oriented edges of the graph. Specifically, given a geometric graph $G$ and an integer $k \ge 0$, we define the $k$-th step configuration of BP on $G$, denoted by $(G, \langle \cdot \to \cdot \rangle^k_G, \pi^k_G)$, as the labeled geometric graph obtained by setting the label of any oriented edge $(v,w)$ in $G$ to be the couple $(\langle v \to w \rangle^k_G, \mathbf{1}_{\{\pi^k_G(v) = w\}})$. We can now state and prove the main theorem of the present section.

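Theorem 4.1 (Continuity of BP) Consider an almost sure realization of the PWIT Limit Theorem:

$$\mathcal{K}_{n,n} \xrightarrow[n \to \infty]{a.s.} \mathcal{T}. \qquad (5)$$

Then for every fixed $k \ge 0$, the $k$-th step configuration of BP on $\mathcal{K}_{n,n}$ converges locally in probability to the $k$-th step configuration of BP on $\mathcal{T}$:

$$\left(\mathcal{K}_{n,n}, \langle \cdot \to \cdot \rangle^k_{\mathcal{K}_{n,n}}, \pi^k_{\mathcal{K}_{n,n}}\right) \xrightarrow[n \to \infty]{P} \left(\mathcal{T}, \langle \cdot \to \cdot \rangle^k_{\mathcal{T}}, \pi^k_{\mathcal{T}}\right). \qquad (6)$$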

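Proof. Let us (redundantly) re-label the vertices of $\mathcal{K}_{n,n}$ by words of $\mathcal{V}$ in a manner that yields a consistent comparison between the messages on $\mathcal{K}_{n,n}$ and those on $\mathcal{T}$. To begin with, let the empty word $\emptyset$ represent the root of $\mathcal{K}_{n,n}$ and the words $1, 2, \ldots, n$ its immediate neighbors, ordered by increasing weight of the edge connecting them to the root. Then, inductively, if the word $v \in \mathcal{V}^*$ represents some vertex $x \in \mathcal{K}_{n,n}$ and $\dot{v}$ some $y \in \mathcal{K}_{n,n}$, then let the words $v.1, v.2, \ldots, v.(n-1)$ represent the $n-1$ neighbors of $x$ distinct from $y$ in $\mathcal{K}_{n,n}$, again ordered by increasing weight of the corresponding edge. Note that this definition almost surely makes sense since the edge weights are pairwise distinct (by continuity of $H$). In fact, it follows from an easy induction on $v \in \mathcal{V}$ that the vertex represented by $v$ in $\mathcal{K}_{n,n}$ is nothing but $\gamma_n(v)$ as soon as $\varrho$ and $n$ are large enough, where $\gamma_n \colon [\mathcal{T}]_\varrho \to [\mathcal{K}_{n,n}]_\varrho$ is the (random) isomorphism involved in the definition of the local convergence (5). In particular,

$$\forall \{v,w\} \in \mathcal{E}, \quad \|v,w\|_{\mathcal{K}_{n,n}} \xrightarrow[n \to \infty]{a.s.} \|v,w\|_{\mathcal{T}}. \qquad (7)$$

With this relabeling in hand, the desired convergence (6) can now be written:

$$\forall \{v,w\} \in \mathcal{E},\ \langle v \to w \rangle^k_{\mathcal{K}_{n,n}} \xrightarrow[n \to \infty]{P} \langle v \to w \rangle^k_{\mathcal{T}} \quad \text{and} \quad \forall v \in \mathcal{V},\ \pi^k_{\mathcal{K}_{n,n}}(v) \xrightarrow[n \to \infty]{P} \pi^k_{\mathcal{T}}(v). \qquad (8)$$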

The recursive nature of the messages almost compels one to think of proving (8) by induction over $k$. The base case $k = 0$ is trivial. However, when trying to go from step $k$ to step $k+1$ one is soon confronted with a major hindrance: the update and decision rules (2) and (3) are not continuous with respect to local convergence. Indeed, writing:

$$\langle v \to w \rangle^{k+1}_{\mathcal{K}_{n,n}} = \min_{\substack{u \in \{v.1, \ldots, v.(n-1), \dot{v}\} \\ u \ne w}} \left\{ \|v,u\|_{\mathcal{K}_{n,n}} - \langle u \to v \rangle^k_{\mathcal{K}_{n,n}} \right\}$$

and

$$\pi^k_{\mathcal{K}_{n,n}}(v) = \arg\min_{u \in \{v.1, \ldots, v.(n-1), \dot{v}\}} \left\{ \|v,u\|_{\mathcal{K}_{n,n}} - \langle u \to v \rangle^k_{\mathcal{K}_{n,n}} \right\},$$

one cannot simply invoke convergence of each term inside the min and arg min to conclude, because there are unboundedly many such terms as $n \to \infty$. Remarkably enough, it turns out that under assumption A2, we can in fact restrict ourselves to a uniformly bounded number of them with probability as high as desired, as stated in the following lemma. □

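Lemma 4.1 (Uniform control on essential messages) For all $v \in \mathcal{V}$ and $k \ge 0$:

$$\limsup_{n \to \infty} P\left( \arg\min_{1 \le i < n} \left\{ \|v, v.i\|_{\mathcal{K}_{n,n}} - \langle v.i \to v \rangle^k_{\mathcal{K}_{n,n}} \right\} \ge i_0 \right) \xrightarrow[i_0 \to \infty]{} 0.$$

The proof of this Lemma is long and technical and is hence deferred to Appendix A.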
5. Second step: analysis of BP on PWIT. In light of Theorem 4.1, one can replace the asymptotic analysis of BP on $\mathcal{K}_{n,n}$ as $n$ becomes large by the direct study of BP's dynamics on the limiting PWIT. Formally, we are interested in the limiting behavior of the random process defined for all $v \in \mathcal{V}^*$ by the recursion:

$$\langle v \to \dot{v} \rangle^{k+1}_{\mathcal{T}} = \min_{i \ge 1} \left\{ \|v, v.i\|_{\mathcal{T}} - \langle v.i \to v \rangle^k_{\mathcal{T}} \right\}, \qquad (9)$$

where the initial values $(\langle v \to \dot{v} \rangle^0_{\mathcal{T}})_{v \in \mathcal{V}^*}$ are i.i.d. random variables independent of $\mathcal{T}$ (0 in the case of our algorithm). The fact that the above min is a.s. well defined despite the infinite number of terms will become clear later (see Lemma 5.4). For the time being, it is sufficient to consider it as an $\overline{\mathbb{R}}$-valued infimum. First observe that at any given time $k$ all $\langle v \to \dot{v} \rangle^k_{\mathcal{T}}$, $v \in \mathcal{V}^*$ share the same distribution, owing to the natural spatial invariance of the PWIT. Moreover, if $F$ denotes the corresponding tail distribution function at a given time, a straightforward computation (see for instance [2]) shows that the tail distribution function $TF$ obtained after a single application of update rule (9) is given by:

$$TF \colon x \mapsto \exp\left( - \int_{-x}^{+\infty} F(t)\, dt \right).$$

This defines an operator $T$ on the space $\mathcal{D}$ of tail distribution functions of $\overline{\mathbb{R}}$-valued random variables, i.e. non-increasing corlol² functions $F \colon \mathbb{R} \to [0, 1]$. $T$ is known to have a unique fixed point (see [5]), the so-called logistic distribution:

$$F^* \colon x \mapsto \frac{1}{1 + e^x}.$$

Our first step will naturally consist in studying the dynamics of $T$ on $\mathcal{D}$.
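The fixed-point property can be checked by hand: $\int_{-x}^{+\infty} \frac{dt}{1+e^t} = \ln(1+e^x)$, so $TF^*(x) = \frac{1}{1+e^x} = F^*(x)$. The sketch below verifies this numerically; the trapezoidal quadrature, the truncation of the integral at `upper`, and the function names are assumptions of this illustration rather than anything used in the paper.

```python
import math

def logistic_tail(x):
    """Tail distribution function F*(x) = 1 / (1 + e^x)."""
    return 1.0 / (1.0 + math.exp(x))

def apply_T(F, x, upper=60.0, steps=20000):
    """Approximate TF(x) = exp(-integral of F over [-x, +inf)) with a trapezoidal
    rule on [-x, upper]; assumes F is negligible beyond `upper` and -x < upper."""
    a = -x
    h = (upper - a) / steps
    total = 0.5 * (F(a) + F(upper))
    for j in range(1, steps):
        total += F(a + j * h)
    return math.exp(-h * total)
```

Applying `apply_T` to a shifted copy $x \mapsto F^*(x - t)$ returns, up to quadrature error, $F^*(x + t)$: shifting the input by $t$ shifts the output by $-t$.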

5.1 Weak attractiveness. The domain of attraction of $F^*$ under the operator $T$ was not known, and its determination has been listed as an open problem by Aldous and Bandyopadhyay ([3, Open Problem #62]). In what follows, we answer this question and more: we fully characterize the asymptotic behavior of the successive iterates $(T^kF)_{k \ge 0}$ for any initial distribution $F \in \mathcal{D}$.

First observe that $T$ is anti-monotone with respect to pointwise order:

$$F_1 \le F_2 \Rightarrow TF_1 \ge TF_2. \qquad (10)$$

This suggests considering the non-decreasing second iterate $T^2$. Unlike $T$, $T^2$ admits infinitely many fixed points. To see this, let $\theta_t$ ($t \in \mathbb{R}$) be the $t$-shift operator defined on $\mathcal{D}$ by $\theta_t F \colon x \mapsto F(x - t)$. Then a trivial change of variable gives:

$$T \circ \theta_t = \theta_{-t} \circ T. \qquad (11)$$

Therefore, it follows that $T^2(\theta_t F^*) = \theta_t(T^2 F^*) = \theta_t F^*$ for all $t \in \mathbb{R}$. That is, the $\theta_t F^*$, $t \in \mathbb{R}$ are fixed points of $T^2$. These considerations lead us to introduce the key tool of our analysis:

Definition 5.1 For $F \in \mathcal{D}$, define the transform $\hat{F}$ as follows:

$$\forall x \in \mathbb{R}, \quad \hat{F}(x) = x + \ln\left( \frac{F(x)}{1 - F(x)} \right).$$



Intuitively, $\hat{F}$ represents the local shift (along the X-axis) between $F$ and $F^*$. Indeed, it enables us to express any $F \in \mathcal{D}$ as a locally deformed version of $F^*$ via the following straightforward inversion formula:

$$\forall x, \quad F(x) = F^*\left(x - \hat{F}(x)\right) = \theta_{\hat{F}(x)} F^*(x).$$

In particular, $\theta_{t_1} F^* \le F \le \theta_{t_2} F^* \Leftrightarrow t_1 \le \hat{F} \le t_2$, and $F = \theta_t F^*$ if and only if $\hat{F}$ is constant on $\mathbb{R}$ with value $t$. In that sense, the maximal amplitude of the variations of $\hat{F}$ on $\mathbb{R}$ tells something about the distance between $F$ and the family of fixed points $\{\theta_t F^*, t \in \mathbb{R}\}$. Thus, the action of $T$ on those variations appears to be of crucial importance and will be at the center of our attention for the rest of this sub-section. We now state three lemmas whose proofs are given in Appendix B.
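The properties of the transform stated above are easy to check numerically. In the sketch below, the function names and the test distribution (an equal mixture of $\theta_1 F^*$ and $\theta_{-1} F^*$) are assumptions chosen for illustration; the transform is only evaluated where $0 < F(x) < 1$.

```python
import math

def logistic_tail(x):
    """F*(x) = 1 / (1 + e^x)."""
    return 1.0 / (1.0 + math.exp(x))

def shift(F, t):
    """The t-shift operator: (theta_t F)(x) = F(x - t)."""
    return lambda x: F(x - t)

def hat(F, x):
    """The transform of Definition 5.1, valid when 0 < F(x) < 1."""
    fx = F(x)
    return x + math.log(fx / (1.0 - fx))

# Illustrative mixture lying between theta_{-1} F* and theta_1 F*.
mixture = lambda x: 0.5 * logistic_tail(x - 1.0) + 0.5 * logistic_tail(x + 1.0)
```

For a pure shift the transform is constant, while for the mixture it varies with $x$ but stays within $[-1, 1]$, in line with the sandwich property.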

² corlol: continuous on the right, limit on the left.
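Lemma 5.1 Let $F \in \mathcal{D} \setminus \{0\}$ be such that $\int_0^\infty F < \infty$. Then, $\widehat{T^4F}$ is bounded on $\mathbb{R}$.

Lemma 5.2 If $F \in \mathcal{D}$ is such that $\hat{F}$ is bounded, then $\widehat{TF}$ is bounded too, and moreover:

$$-\sup_{\mathbb{R}} \hat{F} \le \inf_{\mathbb{R}} \widehat{TF} \le \sup_{\mathbb{R}} \widehat{TF} \le -\inf_{\mathbb{R}} \hat{F}.$$

Further, if $\hat{F}$ is not constant then this contraction becomes strict under a second iteration:

$$\inf_{\mathbb{R}} \hat{F} < \inf_{\mathbb{R}} \widehat{T^2F} \le \sup_{\mathbb{R}} \widehat{T^2F} < \sup_{\mathbb{R}} \hat{F}.$$

Lemma 5.3 Let $F \in \mathcal{D}$ be such that $\hat{F}$ is bounded. Then, $\widehat{T^kF}$ is continuously differentiable for $k \ge 2$, and the family of derivatives $(\widehat{T^kF})'$, $k \ge 3$ is uniformly integrable:

$$\sup_{k \ge 3} \int_{|x| > M} \left| (\widehat{T^kF})'(x) \right| dx \xrightarrow[M \to \infty]{} 0.$$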

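We are now in a position to provide a complete description of the dynamics of $T$ on $\mathcal{D}$.

Theorem 5.1 (Dynamics of $T$ on $\mathcal{D}$) Let $F \in \mathcal{D}$. Assume $F$ is not the 0 function and $\int_0^\infty F < +\infty$ (otherwise $(T^kF)_{k \ge 1}$ trivially alternates between the 0 and 1 functions). Then, there exists a constant $\gamma \in \mathbb{R}$ dependent on $F$ such that $\widehat{T^{2k}F} \xrightarrow[k \to \infty]{} \gamma$ and $\widehat{T^{2k+1}F} \xrightarrow[k \to \infty]{} -\gamma$, uniformly on $\mathbb{R}$. In particular,

$$T^{2k}F \xrightarrow[k \to \infty]{} \theta_\gamma F^*, \qquad T^{2k+1}F \xrightarrow[k \to \infty]{} \theta_{-\gamma} F^*, \qquad \text{uniformly on } \mathbb{R}.$$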

Proof. By Lemma 5.1 one can choose a large enough $M > 0$ for $T^4F$ to lie in the subspace

$$\mathcal{D}_M = \{F \in \mathcal{D},\ -M \le \hat{F} \le M\} = \{F \in \mathcal{D},\ \theta_{-M}F^* \le F \le \theta_M F^*\}.$$

Lemma 5.2 guarantees the stability of $\mathcal{D}_M$ under the action of $T$, so the whole sequence $(T^kF)_{k \ge 4}$ remains in $\mathcal{D}_M$. Even better, the bounded real sequences $(\inf_{\mathbb{R}} \widehat{T^{2k}F})_{k \ge 2}$ and $(\sup_{\mathbb{R}} \widehat{T^{2k}F})_{k \ge 2}$ are monotone, hence convergent, say to $\gamma^-$ and $\gamma^+$ respectively. All we have to show is that $\gamma^- = \gamma^+$; convergence of $(\widehat{T^{2k+1}F})_{k \ge 2}$ to the opposite constant will then simply follow from property (11).

By the Arzelà-Ascoli theorem, the family of (clearly bounded and 1-Lipschitz) functions $(T^{2k}F)_{k \ge 2}$ is relatively compact with respect to compact convergence. Thus, there exists a convergent sub-sequence:

$$T^{2\varphi(k)}F \xrightarrow[k \to \infty]{} F_\infty. \qquad (12)$$

From the uniform continuity of $y \mapsto \ln\frac{y}{1-y}$ on every compact subset of $]0, 1[$ (Heine's theorem), it follows that the restriction of the transform $F \mapsto \hat{F}$ to $\mathcal{D}_M$ is continuous with respect to compact convergence. Hence,

$$\widehat{T^{2\varphi(k)}F} \xrightarrow[k \to \infty]{} \widehat{F_\infty}.$$

Even better, the uniform integrability of variations stated in Lemma 5.3 makes the above compact convergence perfectly equivalent to uniform convergence on all of $\mathbb{R}$. In particular,

$$\inf_{\mathbb{R}} \widehat{F_\infty} = \lim_{k \to \infty} \uparrow \inf_{\mathbb{R}} \widehat{T^{2\varphi(k)}F} = \gamma^- \quad \text{and} \quad \sup_{\mathbb{R}} \widehat{F_\infty} = \lim_{k \to \infty} \downarrow \sup_{\mathbb{R}} \widehat{T^{2\varphi(k)}F} = \gamma^+. \qquad (13)$$

On the other hand, a straightforward use of the dominated convergence theorem shows that the restriction of $T$ to $\mathcal{D}_M$ is continuous with respect to compact convergence. Therefore, (12) implies

$$T^{2(\varphi(k)+1)}F \xrightarrow[k \to \infty]{} T^2F_\infty.$$

But using exactly the same arguments as above (note that $\gamma^-, \gamma^+$ do not depend on $\varphi$), we obtain a similar conclusion:

$$\inf_{\mathbb{R}} \widehat{T^2F_\infty} = \gamma^- \quad \text{and} \quad \sup_{\mathbb{R}} \widehat{T^2F_\infty} = \gamma^+. \qquad (14)$$

By the second part of Lemma 5.2, having both (13) and (14) implies that $\gamma^- = \gamma^+$. □
5.2 Strong attractiveness. So far, we have established the distributional convergence of the message process. To complete the analysis of the algorithm, we now need to prove sample-path-wise convergence. We note that Aldous and Bandyopadhyay [3, 5] have studied the special case where the i.i.d. initial messages $(\langle v \to \dot{v} \rangle^0_{\mathcal{T}})_{v \in \mathcal{V}^*}$ are distributed according to the fixed point $F^*$. They established $L^2$-convergence of the message process to some unique stationary configuration which is independent of $(\langle v \to \dot{v} \rangle^0_{\mathcal{T}})_{v \in \mathcal{V}^*}$. They call this the bivariate uniqueness property. This sub-section is dedicated to extending such a property to the case of $F$-distributed i.i.d. initial messages, where $F$ is any tail distribution satisfying the assumption of Theorem 5.1, namely:

$$\int_0^\infty F < \infty, \quad \text{or equivalently} \quad E\left[ \left( \langle v \to \dot{v} \rangle^0_{\mathcal{T}} \right)^+ \right] < \infty. \qquad (15)$$

Recall that, if (15) does not hold, then $(T^kF)_{k \ge 1}$ simply alternates between the 0 and 1 functions. In other words, all messages in $\mathcal{T}$ become almost surely infinite after the very first iteration. Henceforth, we will assume (15) to hold, which is in particular the case if all initial messages are set to zero. We first state a Lemma that will allow us to fix the problem of non-continuity of the update and decision rules on $\mathcal{T}$ caused by the infinite number of terms involved in the minimization.

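Lemma 5.4 Under assumption (15), $\pi^k_{\mathcal{T}}(v) = \arg\min_{w \sim v} \left\{ \|v,w\|_{\mathcal{T}} - \langle w \to v \rangle^k_{\mathcal{T}} \right\}$ is a.s. well defined for every $k \ge 4$, $v \in \mathcal{V}$ despite the infinite number of terms involved in the argmin. Moreover,

$$\sup_{k \ge 4} P\left( \arg\min_{i \ge 1} \left\{ \|v, v.i\|_{\mathcal{T}} - \langle v.i \to v \rangle^k_{\mathcal{T}} \right\} \ge i_0 \right) \xrightarrow[i_0 \to \infty]{} 0. \qquad (16)$$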



With this uniform control in hand, we are now ready to prove the strong convergence of BP on T. 

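Theorem 5.2 (Convergence of BP on $\mathcal{T}$) Assume the i.i.d. initial messages satisfy (15). Then, up to some additive constant $\gamma \in \mathbb{R}$, the recursive tree process defined by (9) converges to the unique stationary configuration $\langle \cdot \to \cdot \rangle^*_{\mathcal{T}}$ in the following sense: for every $v \in \mathcal{V}^*$,

$$\langle v \to \dot{v} \rangle^{2k}_{\mathcal{T}} \xrightarrow[k \to \infty]{L^2} \langle v \to \dot{v} \rangle^*_{\mathcal{T}} + \gamma \quad \text{and} \quad \langle v \to \dot{v} \rangle^{2k+1}_{\mathcal{T}} \xrightarrow[k \to \infty]{L^2} \langle v \to \dot{v} \rangle^*_{\mathcal{T}} - \gamma.$$

Further, defining $\pi^*_{\mathcal{T}}$ as the assignment induced by $\langle \cdot \to \cdot \rangle^*_{\mathcal{T}}$ according to rule (3), we have convergence of decisions at the root:

$$\pi^k_{\mathcal{T}}(\emptyset) \xrightarrow[k \to \infty]{P} \pi^*_{\mathcal{T}}(\emptyset).$$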

Proof. Denote by $F$ the tail distribution function of the initial messages. The idea is to construct an appropriate stochastic coupling between our $F$-initialized message process and the $F^*$-initialized version, and then use the endogeneity of the latter to conclude. We let $\gamma$ be the constant appearing in Theorem 5.1. First, observe that the dynamics (9) are "anti-homogeneous": if we add the same constant to every initial message, then that constant is simply added to every even message $\langle v \to \dot{v} \rangle^{2k}_{\mathcal{T}}$ and subtracted from every odd message $\langle v \to \dot{v} \rangle^{2k+1}_{\mathcal{T}}$. Therefore, without loss of generality we may assume $\gamma = 0$. That is, for any $\varepsilon > 0$ there exists $k_\varepsilon \in \mathbb{N}$ so that

$$\theta_{-\varepsilon}F^* \le T^{k_\varepsilon}F \le \theta_\varepsilon F^*.$$

By a classical result often termed Strassen's Theorem, probability measures satisfying such a stochastic ordering can always be coupled in a pointwise monotone manner. Specifically, there exists a probability space $E' = (\Omega', \mathcal{F}', P')$, possibly differing from the original space $E = (\Omega, \mathcal{F}, P)$, on which can be defined a random variable $X^\varepsilon$ with distribution $T^{k_\varepsilon}F$ and two random variables $X^-$ and $X^+$ with distribution $F^*$, in such a way that almost surely,

$$X^- - \varepsilon \le X^\varepsilon \le X^+ + \varepsilon. \qquad (17)$$
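For real-valued random variables, a monotone coupling of the kind invoked above can be realized concretely by feeding one common uniform variable into each quantile function. The sketch below illustrates this on a hypothetical stand-in for $T^{k_\varepsilon}F$, namely an equal mixture of $\theta_\varepsilon F^*$ and $\theta_{-\varepsilon}F^*$; the names, the bisection-based quantile, and the choice $\varepsilon = 0.25$ are all assumptions of this illustration. In this particular construction $X^-$ and $X^+$ coincide, which is a special case of (17).

```python
import math

def logistic_tail(x):
    """F*(x) = 1 / (1 + e^x)."""
    return 1.0 / (1.0 + math.exp(x))

def logistic_quantile(u):
    """The point x with F*(x) = u, for 0 < u < 1."""
    return math.log(1.0 / u - 1.0)

def tail_quantile(G, u, lo=-50.0, hi=50.0, tol=1e-10):
    """Bisection for x with G(x) = u, assuming G is continuous and
    strictly decreasing on [lo, hi] with G(lo) > u > G(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if G(mid) > u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eps = 0.25
# Hypothetical tail function sandwiched between theta_{-eps} F* and theta_eps F*.
G = lambda x: 0.5 * logistic_tail(x - eps) + 0.5 * logistic_tail(x + eps)

def coupled_pair(u):
    """Return (X, X_eps) built from the same uniform value u."""
    return logistic_quantile(u), tail_quantile(G, u)
```

Drawing `u` uniformly on (0, 1) yields a pair with the prescribed marginals that differs by at most `eps` almost surely.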
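Now consider the product space $E \otimes \bigotimes_{v \in \mathcal{V}} E'$, over which we can jointly define the PWIT $\mathcal{T}$ and independent copies $(X^-_v, X^\varepsilon_v, X^+_v)_{v \in \mathcal{V}}$ of the triple $(X^-, X^\varepsilon, X^+)$ for each vertex $v \in \mathcal{V}$. On $\mathcal{T}$, let us compare the configurations $(\langle \cdot \to \cdot \rangle^{k,-}_{\mathcal{T}})_{k \ge 0}$, $(\langle \cdot \to \cdot \rangle^{k,\varepsilon}_{\mathcal{T}})_{k \ge 0}$ and $(\langle \cdot \to \cdot \rangle^{k,+}_{\mathcal{T}})_{k \ge 0}$ resulting from three different initial conditions, namely:

$$\forall v \in \mathcal{V}^*, \quad \langle v \to \dot{v} \rangle^{0,-}_{\mathcal{T}} = X^-_v, \qquad \langle v \to \dot{v} \rangle^{0,\varepsilon}_{\mathcal{T}} = X^\varepsilon_v, \qquad \langle v \to \dot{v} \rangle^{0,+}_{\mathcal{T}} = X^+_v.$$

Due to anti-monotony and anti-homogeneity of the update rule (9), inequality (17) "propagates" in the sense that for any $k \ge 0$ and $v \in \mathcal{V}^*$,

$$\langle v \to \dot{v} \rangle^{2k,-}_{\mathcal{T}} - \varepsilon \le \langle v \to \dot{v} \rangle^{2k,\varepsilon}_{\mathcal{T}} \le \langle v \to \dot{v} \rangle^{2k,+}_{\mathcal{T}} + \varepsilon;$$

$$\langle v \to \dot{v} \rangle^{2k+1,+}_{\mathcal{T}} - \varepsilon \le \langle v \to \dot{v} \rangle^{2k+1,\varepsilon}_{\mathcal{T}} \le \langle v \to \dot{v} \rangle^{2k+1,-}_{\mathcal{T}} + \varepsilon.$$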

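Now fix $v \in \mathcal{V}^*$. By construction, the $\varepsilon$-initialized process has the same law as the original process shifted in time by $k_\varepsilon$:

$$\left( \langle v \to \dot{v} \rangle^{k,\varepsilon}_{\mathcal{T}} \right)_{k \ge 0} \overset{d}{=} \left( \langle v \to \dot{v} \rangle^{k + k_\varepsilon}_{\mathcal{T}} \right)_{k \ge 0}.$$

In particular, for every $k \ge k_\varepsilon$, we have

$$\sup_{s,t \ge k} \left\| \langle v \to \dot{v} \rangle^s_{\mathcal{T}} - \langle v \to \dot{v} \rangle^t_{\mathcal{T}} \right\|_{L^2} = \sup_{s,t \ge k - k_\varepsilon} \left\| \langle v \to \dot{v} \rangle^{s,\varepsilon}_{\mathcal{T}} - \langle v \to \dot{v} \rangle^{t,\varepsilon}_{\mathcal{T}} \right\|_{L^2} \le 2 \sup_{t \ge k - k_\varepsilon} \left\| \langle v \to \dot{v} \rangle^{t,\pm}_{\mathcal{T}} - \langle v \to \dot{v} \rangle^*_{\mathcal{T}} \right\|_{L^2} + 2\varepsilon.$$

But from the bivariate uniqueness property established by Aldous and Bandyopadhyay [3, 5] for the logistic distribution, it follows that

$$\sup_{t \ge k - k_\varepsilon} \left\| \langle v \to \dot{v} \rangle^{t,\pm}_{\mathcal{T}} - \langle v \to \dot{v} \rangle^*_{\mathcal{T}} \right\|_{L^2} \xrightarrow[k \to \infty]{} 0.$$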

Thus, the sequence $(\langle v \to \dot{v} \rangle^k_{\mathcal{T}})_{k \ge 0}$ is Cauchy in $L^2$, hence convergent. Using Lemma 5.4 to justify the interchange between limit and minimization, it is not hard to check that the limiting configuration has to be stationary, i.e. is a fixed point for the recursion (9), and that the estimates $\pi^k_{\mathcal{T}}$, $k \ge 0$ do in turn converge (in probability) to the estimate $\pi^*_{\mathcal{T}}$ associated with the limiting configuration. Note that endogeneity implies uniqueness of the stationary configuration, and therefore $\pi^*_{\mathcal{T}}$ is nothing but the infinite optimal assignment studied in [2]. □

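6. Third step: putting things together. Finally, we are now in a position to complete the proof of Theorem 2.1, using the following remarkable result by Aldous.

Theorem 6.1 (Aldous, [2]) Let $\pi^*_{\mathcal{T}}$ be the assignment associated with the unique stationary configuration $\langle \cdot \to \cdot \rangle^*_{\mathcal{T}}$. Then $\pi^*_{\mathcal{T}}$ is a perfect matching on $\mathcal{T}$, and

$$\left(\mathcal{K}_{n,n}, \pi^*_{\mathcal{K}_{n,n}}\right) \xrightarrow[n \to \infty]{\mathcal{D}} \left(\mathcal{T}, \pi^*_{\mathcal{T}}\right). \qquad (18)$$

Proof of Theorem 2.1. Using Theorem 4.1 and Skorokhod's representation Theorem, the above convergence (18) can be extended to include BP's answer at any fixed step $k$:

$$\left(\mathcal{K}_{n,n}, \pi^k_{\mathcal{K}_{n,n}}, \pi^*_{\mathcal{K}_{n,n}}\right) \xrightarrow[n \to \infty]{\mathcal{D}} \left(\mathcal{T}, \pi^k_{\mathcal{T}}, \pi^*_{\mathcal{T}}\right).$$

In particular, the probability of getting a wrong decision at the root of $\mathcal{K}_{n,n}$ converges as $n \to \infty$ to the probability of getting a wrong decision at the root of $\mathcal{T}$: for all $k \ge 0$,

$$P\left( \pi^k_{\mathcal{K}_{n,n}}(\emptyset) \ne \pi^*_{\mathcal{K}_{n,n}}(\emptyset) \right) \xrightarrow[n \to \infty]{} P\left( \pi^k_{\mathcal{T}}(\emptyset) \ne \pi^*_{\mathcal{T}}(\emptyset) \right).$$

Finally, the symmetry of $\mathcal{K}_{n,n}$ lets us rewrite the left-hand side as the expected fraction of errors $E\left[ d\left(\pi^k_{\mathcal{K}_{n,n}}, \pi^*_{\mathcal{K}_{n,n}}\right) \right]$, and Theorem 5.2 ensures that the right-hand side vanishes as $k \to \infty$. □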
7. Conclusion. In this paper we have established that the BP algorithm finds an almost optimal solution to the random $n \times n$ assignment problem in time $O(n^2)$ with high probability. The natural lower bound of $\Omega(n^2)$ makes BP an (order) optimal algorithm. This result significantly improves over both the worst-case upper bound for exact convergence of the BP algorithm proved by Bayati, Shah and Sharma and the best-known computational time achieved by Edmonds and Karp's algorithm [10]. Beyond the obvious practical interest of such an extremely efficient distributed algorithm for locally solving huge instances of the optimal assignment problem, we hope that the method used here (essentially replacing the asymptotic analysis of the algorithm as the size of the underlying graph tends to infinity by its exact study on the infinite limiting structure revealed via local weak convergence) will become a powerful tool in the fascinating quest for a general mathematical understanding of loopy BP.
Appendix A. Proof of Lemma 4.1. The proof of Lemma 4.1 relies upon two technical lemmas stated below. Essentially, the picture is the following: when $i$ gets large, the length of the $i$-th shortest edge attached to $v$ in $\mathcal{K}_{n,n}$ becomes large too (Lemma A.1), whereas the message $\langle v.i \to v \rangle^k_{\mathcal{K}_{n,n}}$ passing along that edge remains reasonably small (Lemma A.2). Therefore, the resulting contribution $\|v, v.i\|_{\mathcal{K}_{n,n}} - \langle v.i \to v \rangle^k_{\mathcal{K}_{n,n}}$ is too large to matter in the minimization. In what follows, $|v|$ will denote the number of letters of the word $v \in \mathcal{V}$, and $v_1, \ldots, v_{|v|}$ its consecutive letters (e.g. if $v = 1.2.1.3$ then $|v| = 4$ and $v_1 = 1$, $v_2 = 2$, $v_3 = 1$, $v_4 = 3$). Also, we will write $v_{\le h}$ for the prefix $v_1 \cdots v_h$.

Lemma A.1 (Uniform control on edge weights) There exist constants $(M_h)_{h \ge 1}$, $\alpha$ and $\beta > 0$ such that for all $v \in \mathcal{V}$, $i \ge 1$, $t \in \mathbb{R}^+$, and all $n$ large enough for $\mathcal{K}_{n,n}$ to contain $v.i$,



< < A^|t,|^—pe"* and v{^\v,v.l\\^ > t^ 



>t]< M\y\e 



-at 



Proof. Suppose $\|v, v.i\|_{\mathcal{K}_{n,n}} \le t$. Then by construction, the sequence of words $(v_{\le 0}, \ldots, v_{\le |v|})$ represents a path in $\mathcal{K}_{n,n}$ starting from the root and ending at a vertex from which at least $i$ incident edges have length at most $t$. Following down this path and deleting every cycle we meet, we obtain a cycle-free path $x = (x_0, \ldots, x_k)$ ($0 \le k \le |v| \wedge (2n - 1)$) starting from the root and satisfying

$$\mathrm{card}\left\{ y \sim x_k,\ y \ne x_{k-1},\ \|x_k, y\|_{\mathcal{K}_{n,n}} \le t \right\} \ge i - 1. \qquad (19)$$

For < j < fc, (xj,Xj-f.i) corresponds to some {v<p-i,v<p), 1 < p < ji'j. By definition of our relabeling, 
the number of edges in ICnn that are incident to w<p-i and shorter than {w<p_i, w<p} is precisely Vp — 1 
or Vp, depending on the parent-edge. Therefore, there exists p € {1, . . . , such that 



< card|?/ ^ {xi, . . ■,Xk}, \\xj,y\\j^^^^ < \\xj , Xj+i\\j^^J^ < Vp. (20) 



The [|] above comes from the fact that only half of the xi, . . . ,Xk are neighbors of Xj in ICnn- We thus 
have shown that 

\v\ / fe-1 

where the event $A_{n,x}$ corresponds to (19) and the event $B^j_{n,x}$ to (20). The summation in the above 
inequality is over all possible cycle-free paths $x = (x_0, \dots, x_k)$ starting from the root in $\mathcal{K}_{n,n}$. Now since all 
the edges involved are pairwise distinct, the events $A_{n,x}, B^0_{n,x}, \dots, B^{k-1}_{n,x}$ are independent. Moreover, 



q=i— 1 



where we have used assumption Al to define a = jjtjqj supggjj+ ^^-^ < +00. This yields the first bound 
since there are fewer than $n^k$ cycle-free paths $x = (x_0, \dots, x_k)$ starting from the root in $\mathcal{K}_{n,n}$. For the second 
one, the event $A_{n,x}$ is simply replaced by $\mathrm{card}\{ y \sim x_k,\ y \ne x_{k-1},\ \|x_k, y\|_{\mathcal{K}_{n,n}} \le t \} \le 1$, whose probability 
is straightforwardly exponentially bounded using assumption A2. □ 

Lemma A.2 (Uniform control on messages) There exist constants $(M_{k,h}, \beta_{k,h})_{k,h \ge 0} > 0$ such that 
for all $v \in \mathcal{V}^*$ and $t \in \mathbb{R}^+$, uniformly in $n$ (as long as $n$ is large enough so that $v \in \mathcal{K}_{n,n}$), 

P(|(«^*)LJ >*) < Mfe>|e-'''=-i"i*. (21) 

Proof. The proof is by induction over $k$. The base case of $k = 0$ follows trivially. Now, assume (21) 
is true for a given $k \in \mathbb{N}$. By Lemma A.1 we can write for all $v \in \mathcal{V}^*$ and $t \in \mathbb{R}^+$: 

P(^(w ^ > = P(^^min - (w ^ > 

< P(||t;,i;.l||^_ > ^) +P((i;.l-.t;)^_ < 



fit ^k,\v\ + l . 

< Af|,|e-2*+Mfe^|„|+ie 
The other side is slightly harder to obtain. Again by Lemma A.1: 



n— 1 n—1 



4=1 1=1 

(an 



^ rrvr.-(t))V"'-'(*) 



i=l 1=1 



where the inequalities hold for any choice of the quantities $r_i(t) > 0$. Our proof thus boils down to the 
following simple question: can we choose the $r_i(t)$ such that 



00 



e 
1=1 



(ar,(t))'e"'^' 



(i) ri{t) is large enough to ensure exponential vanishing of f(t) = 

1=1 
00 

(ii) ri{t) is small enough to ensure exponential vanishing of g{t) — 

i=l 

The answer is yes. Indeed, taking $r_i(t) = \delta i e^{-\gamma t}$ with $\gamma, \delta > 0$ yields 

1 , , 1.x 1 ^ (a6e°'^i 

log/(0 > 7 - /3fc,|i.|+i and -log5(t) < -7+-log2^ ' 



Therefore, choosing any $\gamma < \beta_{k,|v|+1}$ is enough to ensure (i), and taking $\delta$ small enough for $\alpha\delta e^{\alpha\delta+1} < 1$ 



i=l 

:(5s: 

will guarantee (ii) since the right-hand summand is equivalent to '■"'^'l . by Stirling's formula. □ 



1(5-1 

/27rl 
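
The Stirling-type estimate invoked at the end of this proof can be checked numerically. The sketch below is only an illustration under stated assumptions: the constant `c` is a hypothetical stand-in for the product of constants that the proof requires to satisfy $c\,e < 1$, and the helper names are not from the paper. It compares the generic term $(c\,i)^i / i!$ with its Stirling equivalent $(c\,e)^i / \sqrt{2\pi i}$ and shows that the partial sums stabilise when $c\,e < 1$.

```python
import math


def log_term(c, i):
    """log of (c*i)**i / i!, computed in log-space to avoid overflow."""
    return i * math.log(c * i) - math.lgamma(i + 1)


def log_stirling_equivalent(c, i):
    """log of (c*e)**i / sqrt(2*pi*i), the Stirling equivalent of the term above."""
    return i * (math.log(c) + 1.0) - 0.5 * math.log(2.0 * math.pi * i)


def partial_sum(c, n_terms):
    """Partial sum of the series sum_{i>=1} (c*i)**i / i!."""
    return sum(math.exp(log_term(c, i)) for i in range(1, n_terms + 1))


if __name__ == "__main__":
    c = 0.3  # hypothetical constant with c * e < 1
    for i in (10, 100, 1000):
        ratio = math.exp(log_term(c, i) - log_stirling_equivalent(c, i))
        print(i, ratio)  # the ratio tends to 1 as i grows
    print(partial_sum(c, 200), partial_sum(c, 400))  # partial sums stabilise
```

When $c\,e > 1$ the terms grow geometrically instead, which is why the proof needs the constant to be small.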

We now know enough to prove Lemma 4.1. 

Proof of Lemma 4.1. Set $\delta > 0$ small enough to ensure $\alpha\delta e^{\alpha\delta+1} < 1$. Then, for $t \in \mathbb{R}^+$, 

P ( argmin < llw, w.ilL — {v ^ v.i)'^ r > io ) 
V i<i<n ""J / 

n—1 n—1 

< P((t; ^ > i) + ^P(||w,w.i||^ <(5i) +^P((w.i^w)^^^^ ><5i-t) 

i=io 'i=io 
i=io j=io 

by Lemmas A.1 and A.2. Letting $i_0 \to \infty$ and finally $t \to \infty$ yields the desired result. □ 
Appendix B. Proof of the Lemmas in Section 5. Here we prove Lemmas 5.1, 5.2, 5.3 and 5.4. 

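Proof of Lemma 5.1. From $\int_0^{+\infty} F < \infty$ and the definition of $TF\colon x \mapsto e^{-\int_{-x}^{+\infty} F}$, it follows that: 

(i) as $x \to +\infty$, $TF(x) = \Theta\big(e^{-\int_{-x}^{x_0} F}\big)$ for any fixed $x_0 \in \mathbb{R}$; 

(ii) as $x \to -\infty$, $TF(x) = 1 - \Theta\big(\int_{-x}^{+\infty} F\big)$. 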



Now since $F$ is non-zero and non-increasing, there exist $\alpha, \beta > 0$ (simply take $\beta = 1$) such that, for all 
small enough $x \in \mathbb{R}$, $\alpha \le F(x) \le \beta$. Substituting these inequalities into (i) above yields: 

\[ \text{as } x \to +\infty, \quad TF(x) = O(e^{-\alpha x}) \ \text{ and } \ TF(x) = \Omega(e^{-\beta x}). \tag{22} \]

In particular, $TF$ satisfies the assumptions made on $F$, so by induction $T^kF$, $k \ge 2$ also do, and we may 
therefore iteratively apply (i)/(ii) to $TF$, $T^2F$ and $T^3F$. This successively yields: 

\[ \text{as } x \to -\infty, \quad T^2F(x) = 1 - O(e^{\alpha x}) \ \text{ and } \ T^2F(x) = 1 - \Omega(e^{\beta x}); \tag{23} \]

\[ \text{as } x \to +\infty, \quad T^3F(x) = \Theta(e^{-x}); \tag{24} \]

\[ \text{as } x \to -\infty, \quad T^4F(x) = 1 - \Theta(e^{x}). \tag{25} \]

Replacing $F$ by $TF$, we see that (24) also holds for $T^4F$, so we end up with $T^4F(x)$ being both $\Theta(e^{-x})$ 
as $x \to +\infty$ and $1 - \Theta(e^{x})$ as $x \to -\infty$. Besides, on any compact set, $T^4F$ takes values within a compact 
subset of $]0, 1[$ by monotonicity. Hence the boundedness of $\widehat{T^4F}\colon x \mapsto x + \ln\Big(\frac{T^4F(x)}{1 - T^4F(x)}\Big)$ over $\mathbb{R}$. □ 

Proof of Lemma 5.2. It follows from the properties of $T$ that for every $m, M \in \mathbb{R}$, 

\[ \theta_m F^* \le F \le \theta_M F^* \implies \theta_{-M} F^* \le TF \le \theta_{-m} F^*. \]

Once rewritten in terms of the $\hat{\cdot}$-transform, this becomes: 

\[ m \le \hat F \le M \implies -M \le \widehat{TF} \le -m, \]

and the desired inequalities follow by taking $m = \inf_{\mathbb{R}} \hat F$ and $M = \sup_{\mathbb{R}} \hat F$. Now assume $\hat F$ is not constant 
on $\mathbb{R}$. The right-continuity (of $F$ and hence) of $\hat F$ ensures existence of an open interval $(a, b)$ such that 
$M' = \sup_{(a,b)} \hat F < \sup_{\mathbb{R}} \hat F = M$. Then, for $x \ge -a$, 

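\[ TF(x) = \exp\Big(-\int_{-x}^{+\infty} F\Big) \ge \exp\Big(-\int_{-x}^{a} \theta_M F^* - \int_{a}^{b} \theta_{M'} F^* - \int_{b}^{+\infty} \theta_M F^*\Big) = \kappa \times \theta_{-M} F^*(x), \quad \text{with } \kappa = \exp\Big(\int_a^b (\theta_M F^* - \theta_{M'} F^*)\Big) > 1. \]

Applying $T$ again implies that for every $x \in \mathbb{R}$, 

\[ x \le a \implies T^2F(x) \le \exp\Big(-\kappa \int_{-x}^{+\infty} \theta_{-M} F^*\Big) = \big(\theta_M F^*(x)\big)^{\kappa}; \]

\[ x \ge a \implies T^2F(x) \le \exp\Big(-\int_{-x}^{-a} \theta_{-M} F^* - \kappa \int_{-a}^{+\infty} \theta_{-M} F^*\Big) = \kappa' \times \theta_M F^*(x), \]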

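where $\kappa' = (\theta_M F^*(a))^{\kappa - 1} < 1$. Now, simply observing that both $(\theta_M F^*(x))^{\kappa}$ and $\kappa' \times \theta_M F^*(x)$ are 
strictly less than $\theta_M F^*(x)$ is already enough for claiming that $\widehat{T^2F}(x) < M$ for all $x \in \mathbb{R}$. In order to 
conclude that $\sup_{\mathbb{R}} \widehat{T^2F} < M$, we only need to check that the inequality remains strict at $\pm\infty$: 

\[ x \le a \implies \widehat{T^2F}(x) \le x + \ln\Big(\frac{(\theta_M F^*(x))^{\kappa}}{1 - (\theta_M F^*(x))^{\kappa}}\Big) \xrightarrow[x \to -\infty]{} M - \ln\kappa < M; \]

\[ x \ge a \implies \widehat{T^2F}(x) \le x + \ln\Big(\frac{\kappa' \times \theta_M F^*(x)}{1 - \kappa' \times \theta_M F^*(x)}\Big) \xrightarrow[x \to +\infty]{} M + \ln\kappa' < M. \]

The inequality $\inf_{\mathbb{R}} \widehat{T^2F} > \inf_{\mathbb{R}} \hat F$ can be obtained in exactly the same way; we skip the details. □ 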
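
The sandwich property at the start of this proof can be illustrated numerically. The sketch below is a minimal illustration under stated assumptions, not the paper's construction: it takes $F^*(x) = 1/(1+e^{x})$ and $(\theta_m F)(x) = F(x - m)$, which is the reading under which the hat-transform of $\theta_m F^*$ is identically $m$. The integral defining $T$ is truncated at a hypothetical cutoff `upper` and approximated by the trapezoidal rule, and all function names are illustrative.

```python
import math


def f_star(x):
    """Assumed fixed point F*(x) = 1 / (1 + e^x)."""
    return 1.0 / (1.0 + math.exp(x))


def shift(F, m):
    """Assumed shift operator: (theta_m F)(x) = F(x - m)."""
    return lambda x: F(x - m)


def apply_T(F, upper=60.0, steps=20000):
    """Return TF: x -> exp(-int_{-x}^{+inf} F).

    The integral is truncated at `upper` (valid for -x < upper) and
    approximated by the trapezoidal rule with `steps` sub-intervals.
    """
    def TF(x):
        a = -x
        h = (upper - a) / steps
        total = 0.5 * (F(a) + F(upper))
        for j in range(1, steps):
            total += F(a + j * h)
        return math.exp(-h * total)
    return TF


def hat(F, x):
    """Hat-transform of F evaluated at x: x + ln(F(x) / (1 - F(x)))."""
    p = F(x)
    return x + math.log(p / (1.0 - p))


if __name__ == "__main__":
    m, M = 0.0, 1.0
    # A function squeezed between theta_m F* and theta_M F*, so m <= hat(F) <= M.
    F = lambda x: 0.5 * (shift(f_star, m)(x) + shift(f_star, M)(x))
    TF = apply_T(F)
    for x in (-3.0, 0.0, 3.0):
        # Second column lies in [m, M]; third column lies in [-M, -m].
        print(x, hat(F, x), hat(TF, x))
```

With these assumptions, applying `apply_T` to `shift(f_star, m)` returns (up to quadrature error) `shift(f_star, -m)`, which is the sign flip behind the implication $m \le \hat F \le M \implies -M \le \widehat{TF} \le -m$.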
Proof of Lemma 5.3. Fix $k \ge 2$. From the inequality $|e^a - e^b| \le |a - b|$ for all $a, b \le 0$, and the 
fact that $0 \le T^{k-2}F \le 1$, it follows that $T^{k-1}F\colon x \mapsto \exp\big(-\int_{-x}^{+\infty} T^{k-2}F\big)$ is Lipschitz continuous with 
Lipschitz constant 1. Therefore, $T^kF\colon x \mapsto \exp\big(-\int_{-x}^{+\infty} T^{k-1}F\big)$ is differentiable on $\mathbb{R}$ and for all $x \in \mathbb{R}$, 

\[ (T^kF)'(x) = -T^kF(x)\, T^{k-1}F(-x). \]

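Hence, $\widehat{T^kF}\colon x \mapsto x + \ln\Big(\frac{T^kF(x)}{1 - T^kF(x)}\Big)$ is (continuously) differentiable on $\mathbb{R}$ and for all $x \in \mathbb{R}$, 

\[ (\widehat{T^kF})'(x) = \frac{1 - T^kF(x) - T^{k-1}F(-x)}{1 - T^kF(x)}. \tag{26} \]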

It now remains to check the uniform integrability of $\{(\widehat{T^kF})', k \ge 3\}$. Recall that Lemma 5.2 ensures 
uniform boundedness of the family $\{\widehat{T^kF}, k \ge 0\}$. In other words, there exists $M > 0$ such that: 

\[ \forall k \ge 0, \quad \theta_{-M} F^* \le T^kF \le \theta_M F^*. \]

PlXl^^m^ it into '^^^^ "i^T ^i/"! 1 o ir Tnj:il/-lo ^-Vi/:* nni i- m V\ in /-I \ f TP\^ { rvW 6 1 



immediately yields the uniform bound |(T i^)'(a;)| < ^"^g^+M , which is enough 
for uniform integrability on $(0, +\infty)$. For $(-\infty, 0)$ now, observe that the numerator in (26) vanishes as 
$x \to -\infty$ and is a continuously differentiable function of $x$ as soon as $k \ge 3$, with derivative 

\[ T^{k-1}F(-x)\big(T^kF(x) - T^{k-2}F(x)\big). \]

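Therefore, for all $k \ge 3$ and $x \in \mathbb{R}$, 

\[ \big|1 - T^kF(x) - T^{k-1}F(-x)\big| \le \Big|\int_{-\infty}^{x} T^{k-1}F(-u)\big(T^kF(u) - T^{k-2}F(u)\big)\, du\Big| \le \int_{-\infty}^{x} \theta_M F^*(-u)\big(\theta_M F^*(u) - \theta_{-M} F^*(u)\big)\, du. \]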



Now, the above integrand is $O(e^{2u})$ as $u \to -\infty$, so the integral is $O(e^{2x})$ as $x \to -\infty$, whereas the 
denominator in (26) remains always above $1 - \theta_M F^*(x) = \Theta(e^{x})$ as $x \to -\infty$. Thus the resulting bound 
on $\sup_{k \ge 3} |(\widehat{T^kF})'|(x)$ is $O(e^{x})$ as $x \to -\infty$, which is enough for uniform integrability on $(-\infty, 0)$. □ 

Proof of Lemma 5.4. By the definition of T and the fact that the $k$-step messages sent to $v$ by 
all its children are i.i.d., we find that for every $i \ge 2$, 



(27) 



where $X_k$ and $Y_k$ are i.i.d. with distribution $T^kF$ and $(\xi_i)_{i \ge 1}$ is a Poisson point process with rate 1 
independent of $X_k$, $Y_k$. Now, observe that 



Y.m^<Xk-Yk) 

i=l 



i=l 



dx 



E 



{Xk - Ykf] < E [\X*\] + sup 



(28) 

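where $X^*$ is an $F^*$-distributed random variable. It follows from the previous sub-section that 
$\sup_{\mathbb{R}} |\widehat{T^kF}| < +\infty$ as soon as $k \ge 4$, so we can apply the Borel-Cantelli Lemma to get that 

\[ \|v.i, v\|_{\mathcal{T}} - \langle v.i \to v\rangle^k_{\mathcal{T}} > \|v.1, v\|_{\mathcal{T}} - \langle v.1 \to v\rangle^k_{\mathcal{T}} \quad \text{for all large enough } i \]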

with probability one, and hence the argmin is well defined. Even better, the boundedness of T^F derived 
in the previous sub-section is in fact uniform in fc > 4, and this is enough for pop to hold. □ 



